FastHiC: a fast and accurate algorithm to detect long-range chromosomal interactions from Hi-C data
Abstract
MOTIVATION: How chromatin folds in three-dimensional (3D) space is closely related to transcription regulation. As powerful tools for studying 3D chromatin conformation, the recently developed Hi-C technologies enable genome-wide measurement of pairwise chromatin interactions. However, methods for detecting biologically meaningful chromatin interactions from Hi-C data, i.e. peak calling, are still under development. In our previous work, we developed a hidden Markov random field (HMRF) based Bayesian method that explicitly models the non-negligible spatial dependency among adjacent pairs of loci seen in high-resolution Hi-C data, and thereby achieves substantially improved robustness and enhanced statistical power in peak calling. Although superior, both methodologically and in performance, to peak callers that ignore spatial dependency, our previous Bayesian framework suffers from heavy computational costs, because it must model the correlated peak status of neighboring loci pairs and infer the hidden dependency structure.

RESULTS: In this work, we have developed FastHiC, a novel approach based on simulated field approximation, which approximates the joint distribution of the hidden peak status by a set of independent random variables, leading to more tractable computation. Performance comparisons in real data analysis showed that FastHiC not only speeds up our original Bayesian method by more than five times, but also achieves higher peak calling accuracy.

AVAILABILITY AND IMPLEMENTATION: FastHiC is freely accessible at: http://www.unc.edu/~yunmli/FastHiC/

CONTACT: [email protected] or [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
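The idea behind simulated field approximation can be illustrated with a toy example. The sketch below is not the FastHiC implementation: it assumes a square lattice of loci pairs, a 4-neighbor Ising prior with interaction strength `psi`, and a Gaussian emission model with fixed, known parameters (`mu_bg`, `mu_peak`, `sigma`), all of which are hypothetical choices made for illustration. It draws a simulated peak configuration by Gibbs sweeps, then treats each site as independent given its simulated neighbors, which is what makes the per-site posterior cheap to compute.

```python
import math
import random


def neighbors(i, j, n):
    """Yield the 4-connected neighbors of cell (i, j) on an n x n lattice."""
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        a, b = i + di, j + dj
        if 0 <= a < n and 0 <= b < n:
            yield a, b


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def site_posterior(y, field, i, j, psi, mu_bg, mu_peak, sigma):
    """P(z_ij = +1 | y_ij, neighbors held fixed at the simulated field)."""
    s = sum(field[a][b] for a, b in neighbors(i, j, len(y)))
    # Ising prior contribution to the log-odds of peak vs. background.
    log_odds = 2.0 * psi * s
    # Gaussian log-likelihood ratio (toy emission model, an assumption).
    log_odds += ((y[i][j] - mu_bg) ** 2 - (y[i][j] - mu_peak) ** 2) / (2.0 * sigma ** 2)
    return sigmoid(log_odds)


def simulated_field_posteriors(y, psi=0.3, mu_bg=0.0, mu_peak=5.0,
                               sigma=1.0, sweeps=20, seed=0):
    """Approximate per-site peak probabilities on a square matrix y.

    Parameters are held fixed here; a full method would also update them.
    """
    rng = random.Random(seed)
    n = len(y)
    threshold = (mu_bg + mu_peak) / 2.0
    # Initialize the hidden field (+1 = peak, -1 = background) by thresholding.
    field = [[1 if y[i][j] > threshold else -1 for j in range(n)] for i in range(n)]
    # Gibbs sweeps: resample each site given its current neighbors.
    for _ in range(sweeps):
        for i in range(n):
            for j in range(n):
                p = site_posterior(y, field, i, j, psi, mu_bg, mu_peak, sigma)
                field[i][j] = 1 if rng.random() < p else -1
    # Factorized approximation: each site is independent given the simulated field.
    return [[site_posterior(y, field, i, j, psi, mu_bg, mu_peak, sigma)
             for j in range(n)] for i in range(n)]
```

Calling `simulated_field_posteriors(y)` on a matrix of normalized contact values returns a matrix of approximate peak probabilities of the same shape. The point of the factorization is that the final step needs only local sums over neighbors, rather than inference over the full joint configuration of the field.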
Related articles
3DIV: A 3D-genome Interaction Viewer and database
Three-dimensional (3D) chromatin structure is an emerging paradigm for understanding gene regulation mechanisms. Hi-C (high-throughput chromatin conformation capture), a method to detect long-range chromatin interactions, allows extensive genome-wide investigation of 3D chromatin structure. However, broad application of Hi-C data has been hindered by the level of complexity in processing Hi-C ...
Identification of copy number variations and translocations in cancer cells from Hi-C data
Motivation: Eukaryotic chromosomes adopt a complex and highly dynamic three-dimensional (3D) structure, which profoundly affects different cellular functions and outcomes including changes in epigenetic landscape and in gene expression. Making the scenario even more complex, cancer cells harbor chromosomal abnormalities (e.g., copy number variations (CNVs) and translocations) altering their geno...
Statistical confidence estimation for Hi-C data reveals regulatory chromatin contacts.
Our current understanding of how DNA is packed in the nucleus is most accurate at the fine scale of individual nucleosomes and at the large scale of chromosome territories. However, accurate modeling of DNA architecture at the intermediate scale of ∼50 kb-10 Mb is crucial for identifying functional interactions among regulatory elements and their target promoters. We describe a method, Fit-Hi-C...
Intra- and inter-chromosomal interactions correlate with CTCF binding genome wide
A prime goal in systems biology is the comprehensive use of existing high-throughput genomic datasets to gain a better understanding of chromatin organization and genome function. In this report, we use chromatin immunoprecipitation (ChIP) data that map protein-binding sites on the genome, and Hi-C data that map interactions between DNA fragments in the genome in an integrative approach. We fir...
Calculation of One-dimensional Forward Modelling of Helicopter-borne Electromagnetic Data and a Sensitivity Matrix Using Fast Hankel Transforms
The helicopter-borne electromagnetic (HEM) frequency-domain exploration method is an airborne electromagnetic (AEM) technique that is widely used for vast and rough areas for resistivity imaging. The vast amount of digitized data flowing from the HEM method requires an efficient and accurate inversion algorithm. Generally, the inverse modelling of HEM data in the first step requires a precise a...
Journal:
- Bioinformatics
Volume 32, Issue 17
Pages: -
Publication date: 2016